Abstract

The von Neumann algorithm is a simple coordinate-descent algorithm
to determine whether the origin belongs to a polytope generated
by a finite set of points. When the origin is in the interior of the 
polytope, the algorithm generates a sequence of points in the polytope 
that converges linearly to zero. The algorithm’s rate of convergence 
depends on the radius of the largest ball around the origin contained 
in the polytope. 

We show that under the weaker condition that the origin is in the 
polytope, possibly on its boundary, a variant of the von Neumann 
algorithm that includes away steps generates a sequence of points in 
the polytope that converges linearly to zero. The new algorithm’s 
rate of convergence depends on a certain geometric parameter of the 
polytope that extends the above radius but is always positive. Our 
linear convergence result and geometric insights also extend to a variant 
of the Frank-Wolfe algorithm with away steps for minimizing a convex 
quadratic function over a polytope. 
1 Introduction

Assume $A = [a_1 \ \cdots \ a_n] \in \mathbb{R}^{m\times n}$ with $\|a_i\|_2 = 1$, $i = 1,\dots,n$. The von Neumann algorithm, communicated by von Neumann to Dantzig in the late 1940s and discussed later by Dantzig in an unpublished manuscript [7], is a simple algorithm to solve the feasibility problem:

Is $0 \in \mathrm{conv}(A) := \mathrm{conv}\{a_1,\dots,a_n\}$?

More precisely, the algorithm aims to find an approximate solution to the problem

\[ Ax = 0, \quad x \in \Delta_{n-1} := \{x \in \mathbb{R}^n_+ : \|x\|_1 = 1\}. \tag{1} \]

The algorithm starts from an arbitrary point $x_0 \in \Delta_{n-1}$. At the $k$-th iteration the algorithm updates the current trial solution $x_k \in \Delta_{n-1}$ as follows. First, it finds the column $a_j$ of $A$ that forms the widest angle with $y_k := Ax_k$. If this angle is acute, i.e., $A^T y_k > 0$, then the algorithm halts as the vector $y_k$ separates the origin from $\mathrm{conv}(A)$. Otherwise the algorithm chooses $x_{k+1} \in \Delta_{n-1}$ so that $y_{k+1} := Ax_{k+1}$ is the minimum-norm convex combination of $Ax_k$ and $a_j$. Let $e_j \in \Delta_{n-1}$ denote the $n$-dimensional vector with $j$-th component equal to one and all other components equal to zero. To ease notation, we shall write $\|\cdot\|$ for $\|\cdot\|_2$ throughout the paper.

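Von Neumann Algorithm

1. pick $x_0 \in \Delta_{n-1}$; put $y_0 := Ax_0$; $k := 0$.

2. for $k = 0,1,2,\dots$

      if $A^T y_k > 0$ then HALT: $0 \notin \mathrm{conv}(A)$
      $j := \mathrm{argmin}_{i=1,\dots,n} \langle a_i, y_k\rangle$;
      $\theta_k := \mathrm{argmin}_{\theta\in[0,1]} \{\|y_k + \theta(a_j - y_k)\|\}$;
      $x_{k+1} := (1-\theta_k)x_k + \theta_k e_j$; $y_{k+1} := Ax_{k+1}$;

   end for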
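The steps above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation: the function name `von_neumann`, the choice of starting vertex $x_0 = e_1$, the stopping tolerance `tol`, and the iteration cap `max_iter` are all assumptions added here so that the loop terminates in finite precision.

```python
def von_neumann(cols, max_iter=1000, tol=1e-9):
    """cols[i] is the column a_i of A (list of floats, assumed unit norm).

    Returns (status, x, y, iterations). The tolerance and iteration cap
    are illustrative additions, not part of the algorithm statement.
    """
    n, m = len(cols), len(cols[0])
    x = [0.0] * n
    x[0] = 1.0                      # x_0 := e_1, a vertex of the simplex
    y = list(cols[0])               # y_0 := A x_0
    for k in range(max_iter):
        g = [sum(a_i[r] * y[r] for r in range(m)) for a_i in cols]  # A^T y_k
        if min(g) > 0:
            return "infeasible", x, y, k        # y_k separates 0 from conv(A)
        if sum(v * v for v in y) <= tol * tol:
            return "approx_feasible", x, y, k   # ||A x_k|| is numerically zero
        j = min(range(n), key=lambda i: g[i])   # column with widest angle to y_k
        a = [cols[j][r] - y[r] for r in range(m)]
        aa = sum(v * v for v in a)
        ay = sum(a[r] * y[r] for r in range(m))
        # exact line search on [0, 1] for ||y_k + theta (a_j - y_k)||
        theta = min(1.0, max(0.0, -ay / aa)) if aa > 0.0 else 0.0
        x = [(1.0 - theta) * xi for xi in x]
        x[j] += theta
        y = [y[r] + theta * a[r] for r in range(m)]
    return "max_iter", x, y, max_iter
```

For instance, with the four columns $(\pm 1, 0)$, $(0, \pm 1)$ the origin is interior to $\mathrm{conv}(A)$ and the sketch reaches $y = 0$ after one step, whereas with the two columns $(1,0)$, $(0,1)$ it returns a separating vector.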

The von Neumann algorithm can be seen as a kind of coordinate-descent method for finding a solution to (1): At each iteration the algorithm judiciously selects a coordinate $j$ and increases the weight of the $j$-th component of $x_k$ while decreasing all of the others via a line-search step. Like other currently popular coordinate-descent and first-order methods for convex optimization, the main attractive features of the von Neumann algorithm are its simplicity and low computational cost per iteration. Another attractive feature is its convergence rate. Epelman and Freund [8] showed that the speed of convergence of the von Neumann algorithm can be characterized in terms of the following condition measure of the matrix $A$:

\[ \rho(A) := \max_{z\in\mathbb{R}^m,\ \|z\|=1}\ \min_{i=1,\dots,n} \langle a_i, z\rangle. \tag{2} \]

The condition measure $\rho(A)$ was introduced by Goffin [13] and later independently studied by Cheung and Cucker [2]. The latter set of authors showed that $|\rho(A)|$ is also a certain distance to ill-posedness in the spirit introduced and developed by Renegar [?].

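Observe that $\rho(A)$ can also be written as

\[ \rho(A) = \max_{z\in\mathbb{R}^m,\ \|z\|=1}\ \min_{v\in\Delta_{n-1}} \langle A^T z, v\rangle = \max_{z\in\mathbb{R}^m,\ \|z\|=1}\ \min_{v\in\Delta_{n-1}} \langle z, Av\rangle. \tag{3} \]

Hence $\rho(A) > 0$ if and only if $0 \notin \mathrm{conv}(A)$ and $\rho(A) < 0$ if and only if $0 \in \mathrm{int}(\mathrm{conv}(A))$. When $\rho(A) > 0$, this condition measure is closely related to the concept of margin in binary classification [25] and with the minimum enclosing ball problem in computational geometry [6]. The quantity $\rho(A)$ also has the following geometric interpretation as discussed in [3, Proposition 6.28]. If $\rho(A) \ge 0$ then from (3) and Lagrangian duality we get

\[
\begin{aligned}
\rho(A) &= \max_{z\in\mathbb{R}^m,\ \|z\|\le 1}\ \min_{v\in\Delta_{n-1}} \langle z, Av\rangle \\
&= \min_{v\in\Delta_{n-1}}\ \max_{z\in\mathbb{R}^m,\ \|z\|\le 1} \langle z, Av\rangle \\
&= \min\{\|y\| : y \in \mathrm{conv}(A)\} \\
&= \mathrm{dist}(0, \partial\,\mathrm{conv}(A)).
\end{aligned}
\tag{4}
\]

On the other hand, if $\rho(A) < 0$ then (3) yields

\[
\begin{aligned}
|\rho(A)| &= -\rho(A) \\
&= \min_{z\in\mathbb{R}^m,\ \|z\|=1}\ \max_{v\in\Delta_{n-1}} \langle z, Av\rangle \\
&= \max\{r : \|y\| \le r \Rightarrow y \in \mathrm{conv}(A)\} \\
&= \mathrm{dist}(0, \partial\,\mathrm{conv}(A)).
\end{aligned}
\tag{5}
\]

In either case $|\rho(A)| = \mathrm{dist}(0, \partial\,\mathrm{conv}(A))$. Furthermore, observe that under the assumption $A = [a_1 \ \cdots \ a_n] \in \mathbb{R}^{m\times n}$ with $\|a_i\| = 1$, $i = 1,\dots,n$, it follows that $|\langle z, Av\rangle| \le 1$ for all $z \in \mathbb{R}^m$, $\|z\| = 1$ and $v \in \Delta_{n-1}$. In particular, from (3) it follows that $|\rho(A)| \le 1$.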
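To make definition (2) and the identity $|\rho(A)| = \mathrm{dist}(0, \partial\,\mathrm{conv}(A))$ concrete, the following sketch estimates $\rho(A)$ for $m = 2$ by brute force over a grid of unit vectors $z$. The function name `rho_2d` and the grid size `samples` are assumptions made for illustration; the grid search only gives a lower estimate of the true maximum and is not a method used in the paper.

```python
import math

def rho_2d(cols, samples=3600):
    """Grid estimate of rho(A) = max_{||z||=1} min_i <a_i, z> for columns in R^2."""
    best = -math.inf
    for t in range(samples):
        angle = 2.0 * math.pi * t / samples
        z = (math.cos(angle), math.sin(angle))
        # inner minimum over the columns a_i
        worst = min(a[0] * z[0] + a[1] * z[1] for a in cols)
        best = max(best, worst)
    return best

# Unit square rotated by 45 degrees: 0 is interior, distance to the boundary is 1/sqrt(2)
square = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
rho_interior = rho_2d(square)          # negative, close to -1/sqrt(2)

# Two columns only: 0 is outside conv(A), at distance 1/sqrt(2)
rho_outside = rho_2d([[1.0, 0.0], [0.0, 1.0]])   # positive, close to 1/sqrt(2)
```

In both cases the absolute value of the estimate matches the distance from the origin to the boundary of the polytope, in line with (4) and (5).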

Epelman and Freund [8] showed the following properties of the von Neumann algorithm. When $\rho(A) < 0$ the algorithm generates iterates $x_k \in \Delta_{n-1}$, $k = 1,2,\dots$ such that

\[ \|Ax_k\|^2 \le (1 - \rho(A)^2)^k \|Ax_0\|^2. \tag{6} \]

On the other hand, the iterates $x_k \in \Delta_{n-1}$ also satisfy $\|Ax_k\|^2 \le \frac{1}{k}$ as long as the algorithm has not halted. In particular, if $\rho(A) > 0$ then by (4) the algorithm must halt with a certificate of infeasibility $A^T y_k > 0$ for $0 \in \mathrm{conv}(A)$ in at most $\frac{1}{\rho(A)^2}$ iterations. The latter bound is identical to a classical convergence bound for the perceptron algorithm [?]. This is not a coincidence as there is a nice duality between the perceptron and the von Neumann algorithms [19, 23].

We show that a variant of the von Neumann algorithm with away steps has the following stronger convergence properties. When $0 \in \mathrm{conv}(A)$, possibly on its boundary, the algorithm generates a sequence $x_k \in \Delta_{n-1}$, $k = 1,2,\dots$ satisfying

\[ \|Ax_k\|^2 \le \left(1 - \frac{w(A)^2}{16}\right)^{k/2} \|Ax_0\|^2. \tag{7} \]

The quantity $w(A)$ is a kind of relative width of $\mathrm{conv}(A)$ that is at least as large as $|\rho(A)|$. However, unlike $|\rho(A)|$ the relative width $w(A)$ is positive for any non-zero matrix $A \in \mathbb{R}^{m\times n}$ provided $0 \in \mathrm{conv}(A)$.

When $\rho(A) > 0$, or equivalently $0 \notin \mathrm{conv}(A)$, the von Neumann algorithm with away steps finds a certificate of infeasibility $A^T y_k > 0$ for $0 \in \mathrm{conv}(A)$ in at most $\frac{8}{\rho(A)^2}$ iterations.

Figure 1 illustrates the different behavior of the von Neumann algorithm and the variant with away steps described in Section 2 on a small example (the entries of the matrix $A$ and of the starting point $y_0$ are only partially legible here: "1 0; 0 -1"). The figure depicts the path of iterates $\{y_k : k = 0,1,\dots\}$ generated by each algorithm starting from the same point $y_0$. The zig-zagging behavior in the first case occurs because after the third iteration the search direction is nearly perpendicular to the current iterate and as a consequence the algorithm makes slow progress. By contrast, in the second case the away steps provide alternative search directions that enable the algorithm to make faster progress.

The von Neumann algorithm can be seen as a special case of the Frank-Wolfe (also known as conditional gradient) algorithm [9, ?]. The von Neumann algorithm is also nearly identical to an algorithm for minimizing a quadratic form over a convex set independently developed by Gilbert [12]. The name "Gilbert's algorithm" appears to be more popular in the computational geometry literature [?].

We show that a linear convergence result similar to (7) also holds for a version of the Frank-Wolfe algorithm with away steps for minimizing a strongly convex quadratic function over a polytope. This variant of the Frank-Wolfe algorithm with away steps was introduced by Wolfe [26] and has been subsequently studied by various authors. In particular, linear convergence results similar to ours have been previously established in [?] and more recently in [1]. Linear convergence results in the same spirit also hold for the randomized Kaczmarz algorithm [24] and for the methods of randomized coordinate descent and iterated projections [18]. The computational article [14] also reports numerical experiments for variants of the von Neumann algorithm with away steps. Our main contributions are the succinct and transparent proofs of linear convergence results that highlight the role of the relative width $w(A)$ and a closely related restricted width $\phi(A)$. Our presentation unveils a deep connection between problem conditioning as encompassed by the quantities $w(A)$, $\phi(A)$ and the behavior of the von Neumann and Frank-Wolfe algorithms with away steps. We also provide some lower bounds on $w(A)$ and $\phi(A)$ in terms of certain radii quantities that naturally extend $\rho(A)$. We note that the linear convergence results in [17] are stated in terms of a certain pyramidal width whose geometric intuition and properties appear to be less understood than those of $w(A)$ and $\phi(A)$.

Figure 1: Iterates generated by the von Neumann algorithm (top) and its variant with away steps (bottom)

The rest of the paper is organized as follows. In Section 2 we describe a von Neumann Algorithm with Away Steps and establish its main convergence result in terms of the relative width $w(A)$. Section 3 extends our main result to the more general problem of minimizing a quadratic function over the polytope $\mathrm{conv}(A)$. Finally, Section 4 discusses some properties of the relative and restricted widths.
2 Von Neumann Algorithm with Away Steps

Throughout this section we assume $A = [a_1 \ \cdots \ a_n] \in \mathbb{R}^{m\times n}$ with $\|a_i\| = 1$, $i = 1,\dots,n$. We next consider a variant of the von Neumann Algorithm that includes so-called "away" steps. To that end, at each iteration, in addition to a "regular step" the algorithm considers an alternative "away step". Each of these away steps identifies a coordinate $\ell$ such that the $\ell$-th component of $x_k$ is positive and decreases the weight of the $\ell$-th component of $x_k$. The algorithm needs to keep track of the support, that is, the set of positive entries of a vector. To that end, given $x \in \mathbb{R}^n_+$, let the support of $x$ be defined as

\[ S(x) := \{i \in \{1,\dots,n\} : x_i > 0\}. \]


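Von Neumann Algorithm with Away Steps

1. pick $x_0 \in \Delta_{n-1}$; put $y_0 := Ax_0$; $k := 0$.

2. for $k = 0,1,2,\dots$

      if $A^T y_k > 0$ then HALT: $0 \notin \mathrm{conv}(A)$
      $j := \mathrm{argmin}_{i=1,\dots,n} \langle a_i, y_k\rangle$; $\ell := \mathrm{argmax}_{i\in S(x_k)} \langle a_i, y_k\rangle$;
      if $\langle a_j - y_k, y_k\rangle < \langle y_k - a_\ell, y_k\rangle$ then (regular step)
         $a := a_j - y_k$; $u := e_j - x_k$; $\theta_{\max} := 1$
      else (away step)
         $a := y_k - a_\ell$; $u := x_k - e_\ell$; $\theta_{\max} := \frac{(x_k)_\ell}{1 - (x_k)_\ell}$
      endif
      $\theta_k := \mathrm{argmin}_{\theta\in[0,\theta_{\max}]} \{\|y_k + \theta a\|\}$;
      $x_{k+1} := x_k + \theta_k u$; $y_{k+1} := Ax_{k+1}$

   end for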
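A minimal Python sketch of the away-step loop is given below, under the same caveats as any finite-precision implementation. The function names, the tolerance `tol`, the iteration cap, the starting vertex $x_0 = e_1$, and the explicit clean-up of round-off when $\theta_k = \theta_{\max}$ are assumptions added for illustration; they are not part of the algorithm statement.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def von_neumann_away(cols, max_iter=10000, tol=1e-9):
    """cols[i] is the column a_i of A. Returns (status, x, y, iterations)."""
    n, m = len(cols), len(cols[0])
    x = [0.0] * n
    x[0] = 1.0                          # x_0 is an extreme point of the simplex
    y = list(cols[0])                   # y_0 := A x_0
    for k in range(max_iter):
        g = [dot(a_i, y) for a_i in cols]            # entries of A^T y_k
        if min(g) > 0:
            return "infeasible", x, y, k
        yy = dot(y, y)
        if yy <= tol * tol:
            return "approx_feasible", x, y, k
        j = min(range(n), key=lambda i: g[i])
        support = [i for i in range(n) if x[i] > 0.0]
        l = max(support, key=lambda i: g[i])
        regular = (g[j] - yy) < (yy - g[l])          # compare <a_j - y, y> and <y - a_l, y>
        if regular:
            a = [cols[j][r] - y[r] for r in range(m)]
            u = [-xi for xi in x]
            u[j] += 1.0                              # u = e_j - x_k
            theta_max = 1.0
        else:
            a = [y[r] - cols[l][r] for r in range(m)]
            u = list(x)
            u[l] -= 1.0                              # u = x_k - e_l
            theta_max = x[l] / (1.0 - x[l]) if x[l] < 1.0 else math.inf
        aa = dot(a, a)
        # exact line search for ||y_k + theta a|| on [0, theta_max]
        theta = min(theta_max, -dot(a, y) / aa) if aa > 0.0 else 0.0
        x = [xi + theta * ui for xi, ui in zip(x, u)]
        if theta == theta_max:                       # boundary step: remove round-off
            if regular:
                x = [0.0] * n
                x[j] = 1.0
            else:
                x[l] = 0.0
        y = [y[r] + theta * a[r] for r in range(m)]
    return "max_iter", x, y, max_iter
```

As a usage example, for the columns $(0,1)$, $(1,0)$, $(-1,0)$ the origin lies on the boundary of $\mathrm{conv}(A)$, which is the situation where away steps are meant to help.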

Note that the above von Neumann Algorithm with Away Steps can also be applied to any non-zero matrix $A = [a_1 \ \cdots \ a_n]$. The assumption that the columns of $A$ are normalized, i.e., $\|a_i\| = 1$, $i = 1,\dots,n$, simplifies our notation and exposition. In Section 3 below we extend our discussion to the case when the columns of $A$ are not necessarily normalized.

Observe that the iterates $x_k$, $k = 0,1,\dots$, generated by the above von Neumann Algorithm with Away Steps satisfy $x_k \in \Delta_{n-1}$. This fact follows by induction: By construction, $x_0 \in \Delta_{n-1}$. At iteration $k$ we have $x_{k+1} = x_k + \theta_k u$ where $x_k \in \Delta_{n-1}$ and the components of $u$ add up to zero as $u$ is either $e_j - x_k$ or $x_k - e_\ell$. The bound $\theta_k \in [0, \theta_{\max}]$ in turn guarantees that $x_{k+1} \ge 0$ and so $\|x_{k+1}\|_1 = \|x_k\|_1 = 1$.
Define the relative width $w(A)$ of $\mathrm{conv}(A)$ as

\[ w(A) := \min_{x\ge 0,\ Ax\ne 0}\ \max_{\ell, j} \left\{ \frac{\langle Ax, a_\ell - a_j\rangle}{\|Ax\|} : \ell \in S(x),\ j \in \{1,\dots,n\} \right\}. \tag{8} \]

The next proposition shows that $w(A) \ge |\rho(A)|$ when $0 \in \mathrm{conv}(A)$. To that end, observe that $w(A)$ can also be written as

\[ w(A) = \min_{x\ge 0,\ Ax\ne 0}\ \max_{u, v} \left\{ \frac{\langle Ax, Au - Av\rangle}{\|Ax\|} : u, v \in \Delta_{n-1},\ S(u) \subseteq S(x) \right\}. \tag{9} \]

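Proposition 1 If $A$ is such that $0 \in \mathrm{conv}(A)$ then $w(A) \ge |\rho(A)|$.

Proof: Since $0 \in \mathrm{conv}(A)$, equation (3) yields

\[ \rho(A) = \max_{z\in\mathbb{R}^m,\ \|z\|=1}\ \min_{v\in\Delta_{n-1}} \langle z, Av\rangle \le 0. \]

In particular,

\[ |\rho(A)| = \min_{z\in\mathbb{R}^m,\ \|z\|=1}\ \max_{v\in\Delta_{n-1}} \langle z, -Av\rangle \le \min_{x\ge 0,\ Ax\ne 0}\ \max_{v\in\Delta_{n-1}} \frac{\langle Ax, -Av\rangle}{\|Ax\|}. \tag{10} \]

Hence from (9) we get

\[ w(A) \ge \min_{x\ge 0,\ Ax\ne 0}\ \max_{v\in\Delta_{n-1}} \frac{\langle Ax, -Av\rangle}{\|Ax\|} \ge |\rho(A)|. \]

The first inequality holds because we can choose $u = \frac{x}{\|x\|_1}$ in (9), for which $\langle Ax, Au\rangle \ge 0$. The second inequality follows from (10). ∎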

Observe that under the assumption $A = [a_1 \ \cdots \ a_n] \in \mathbb{R}^{m\times n}$ with $\|a_i\| = 1$, $i = 1,\dots,n$, it follows that $\|Au - Av\| \le 2$ for all $u, v \in \Delta_{n-1}$. In particular, from (9) it follows that $w(A) \le 2$. In Section 4 below we discuss some additional properties of $w(A)$. In particular, we will formally prove that $w(A) > 0$ for any nonzero matrix $A \in \mathbb{R}^{m\times n}$ such that $0 \in \mathrm{conv}(A)$.

We are now ready to state the main properties of the von Neumann algorithm with away steps.


Theorem 1 Assume $x_0 \in \Delta_{n-1}$ is one of the extreme points of $\Delta_{n-1}$.

(a) If $0 \in \mathrm{conv}(A)$ then the iterates $x_k \in \Delta_{n-1}$, $y_k = Ax_k$, $k = 1,2,\dots$ generated by the von Neumann Algorithm with Away Steps satisfy

\[ \|y_k\|^2 \le \left(1 - \frac{w(A)^2}{16}\right)^{k/2} \|y_0\|^2. \]

(b) The iterates $x_k \in \Delta_{n-1}$, $y_k = Ax_k$, $k = 1,2,\dots$ generated by the von Neumann Algorithm with Away Steps also satisfy

\[ \|y_k\|^2 \le \frac{8}{k} \]

as long as the algorithm has not halted. In particular, if $0 \notin \mathrm{conv}(A)$ then the von Neumann Algorithm with Away Steps finds a certificate of infeasibility $A^T y_k > 0$ for $0 \in \mathrm{conv}(A)$ in at most $\frac{8}{\rho(A)^2}$ iterations.

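The crux of the proof of Theorem 1 is the following elementary lemma.

Lemma 1 Assume $a, y \in \mathbb{R}^m$ satisfy $\langle a, y\rangle < 0$. Then

\[ \min_{\theta\ge 0} \|y + \theta a\|^2 = \|y\|^2 - \frac{\langle a, y\rangle^2}{\|a\|^2} \]

and the minimum is attained at $\theta = -\frac{\langle a, y\rangle}{\|a\|^2}$.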
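For completeness, the calculation behind Lemma 1 can be written out as follows. This is the standard one-dimensional quadratic minimization; the symbol $\theta^\ast$ for the minimizer is notation introduced here.

```latex
\begin{align*}
\|y + \theta a\|^2 &= \|y\|^2 + 2\theta\,\langle a, y\rangle + \theta^2\,\|a\|^2, \\
\frac{d}{d\theta}\,\|y + \theta a\|^2 = 0
  &\iff \theta = \theta^\ast := -\frac{\langle a, y\rangle}{\|a\|^2} > 0
  \quad\text{(since } \langle a, y\rangle < 0\text{)}, \\
\|y + \theta^\ast a\|^2
  &= \|y\|^2 - 2\,\frac{\langle a, y\rangle^2}{\|a\|^2} + \frac{\langle a, y\rangle^2}{\|a\|^2}
   = \|y\|^2 - \frac{\langle a, y\rangle^2}{\|a\|^2}.
\end{align*}
```

Because the quadratic is convex in $\theta$ and $\theta^\ast > 0$, the constraint $\theta \ge 0$ is inactive and $\theta^\ast$ is also the constrained minimizer.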


Proof of Theorem 1

(a) The algorithm generates $y_{k+1}$ by solving a problem of the form

\[ \|y_{k+1}\|^2 = \min_{\theta\in[0,\theta_{\max}]} \|y_k + \theta a\|^2 \]

where $a = a_j - y_k$ or $a = y_k - a_\ell$ is chosen so that $\langle a, y_k\rangle = \min\{\langle y_k - a_\ell, y_k\rangle, \langle a_j - y_k, y_k\rangle\}$. In particular,

\[
\begin{aligned}
-\langle a, y_k\rangle &\ge \tfrac{1}{2}\left(\langle a_\ell - y_k, y_k\rangle + \langle y_k - a_j, y_k\rangle\right) \\
&= \tfrac{1}{2}\langle a_\ell - a_j, y_k\rangle \\
&\ge \tfrac{1}{2} w(A)\|y_k\|.
\end{aligned}
\tag{11}
\]

If $\theta_k < \theta_{\max}$ then Lemma 1 applied to $y := y_k$ yields

\[ \|y_{k+1}\|^2 = \|y_k\|^2 - \frac{\langle a, y_k\rangle^2}{\|a\|^2} \le \|y_k\|^2 - \frac{w(A)^2}{16}\|y_k\|^2. \]

The inequality follows from (11) and $\|a\| \le 1 + \|y_k\| \le 2$. Thus each time the algorithm performs an iteration with $\theta_k < \theta_{\max}$, the value of $\|y_k\|^2$ decreases at least by the factor $1 - \frac{w(A)^2}{16}$.

To conclude, it suffices to show that after $N$ iterations the number of iterations with $\theta_k < \theta_{\max}$ is at least $N/2$. To that end, we apply the following argument from [17]: Observe that when $\theta_k = \theta_{\max}$ we have $|S(x_{k+1})| < |S(x_k)|$. On the other hand, when $\theta_k < \theta_{\max}$ we have $|S(x_{k+1})| \le |S(x_k)| + 1$. Since $|S(x_0)| = 1$ and $|S(x)| \ge 1$ for every $x \in \Delta_{n-1}$, after any number of iterations there must have been at least as many iterations with $\theta_k < \theta_{\max}$ as there have been iterations with $\theta_k = \theta_{\max}$. Hence after $N$ iterations, the number of iterations with $\theta_k < \theta_{\max}$ is at least $N/2$.

(b) Proceed as above but note that if the algorithm does not halt at the $k$-th iteration then $\langle a, y_k\rangle \le \langle a_j - y_k, y_k\rangle \le -\|y_k\|^2$. Thus each time the algorithm performs an iteration with $\theta_k < \theta_{\max}$, we have

\[ \|y_{k+1}\|^2 \le \|y_k\|^2 - \frac{\|y_k\|^4}{\|a\|^2} \le \|y_k\|^2 - \frac{\|y_k\|^4}{4}. \tag{12} \]

Assume the algorithm has not halted after $N$ iterations. Let $m$ be the number of iterations with $\theta_k < \theta_{\max}$ up to iteration $N$. If $\|y_N\|^2 \le \frac{4}{m}$ and $\theta_N < \theta_{\max}$ then from (12) we get

\[ \|y_{N+1}\|^2 \le \frac{4}{m} - \frac{4}{m^2} = \frac{4(m-1)}{m^2} \le \frac{4}{m+1}. \]

It follows by induction that if the algorithm has not halted after $N$ iterations then $\|y_N\|^2 \le \frac{4}{m}$. As in part (a), it must be the case that $m \ge \frac{N}{2}$ and consequently $\|y_N\|^2 \le \frac{8}{N}$. Finally, if $0 \notin \mathrm{conv}(A)$ then $\rho(A) = \min\{\|y\| : y \in \mathrm{conv}(A)\} > 0$ and so the algorithm must halt with a certificate of infeasibility $A^T y_k > 0$ for $0 \in \mathrm{conv}(A)$ after at most $\frac{8}{\rho(A)^2}$ iterations. ∎


3 Frank-Wolfe Algorithm with Away Steps

Throughout this section assume $A = [a_1 \ \cdots \ a_n] \in \mathbb{R}^{m\times n}$ is a non-zero matrix, and $f(y) = \frac{1}{2}\langle y, Qy\rangle + \langle b, y\rangle$ for a symmetric positive definite matrix $Q \in \mathbb{R}^{m\times m}$ and $b \in \mathbb{R}^m$. Consider the problem

\[ \min_{y\in\mathrm{conv}(A)} f(y) \ \Leftrightarrow\ \min_{x\in\Delta_{n-1}} f(Ax). \tag{13} \]

Observe that in contrast to Section 2 we do not assume that the columns of $A$ are normalized in this section.

Problem (1) can be seen as a special case of (13) when $Q = I$ and $b = 0$. The von Neumann Algorithm can also be seen as a special case of the Frank-Wolfe Algorithm [9] for (13). This section extends the ideas and results from Section 2 to the following variant of the Frank-Wolfe algorithm with away steps. This variant can be traced back to Wolfe [26]. It has been a subject of study in a number of papers [?].

Frank-Wolfe Algorithm with Away Steps

1. pick $x_0 \in \Delta_{n-1}$; put $y_0 := Ax_0$; $k := 0$.

2. for $k = 0,1,2,\dots$

      $j := \mathrm{argmin}_{i=1,\dots,n} \langle a_i, \nabla f(y_k)\rangle$; $\ell := \mathrm{argmax}_{i\in S(x_k)} \langle a_i, \nabla f(y_k)\rangle$;
      if $\langle a_j - y_k, \nabla f(y_k)\rangle < \langle y_k - a_\ell, \nabla f(y_k)\rangle$ then (regular step)
         $a := a_j - y_k$; $u := e_j - x_k$; $\theta_{\max} := 1$
      else (away step)
         $a := y_k - a_\ell$; $u := x_k - e_\ell$; $\theta_{\max} := \frac{(x_k)_\ell}{1 - (x_k)_\ell}$
      endif
      $\theta_k := \mathrm{argmin}_{\theta\in[0,\theta_{\max}]} f(y_k + \theta a)$;
      $x_{k+1} := x_k + \theta_k u$; $y_{k+1} := Ax_{k+1}$

   end for

Observe that the computation of $\theta_k$ in the second to last step reduces to minimizing a one-dimensional convex quadratic function over the interval $[0, \theta_{\max}]$.
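The one-dimensional minimization has a closed form: since $f(y + \theta a) = f(y) + \theta\langle a, \nabla f(y)\rangle + \frac{\theta^2}{2}\langle a, Qa\rangle$, the unconstrained minimizer is $-\langle a, \nabla f(y)\rangle / \langle a, Qa\rangle$, which is then clipped to $[0, \theta_{\max}]$. The sketch below illustrates this; the function name `quadratic_line_search` and the list-of-lists representation of $Q$ are assumptions made for the example.

```python
def quadratic_line_search(Q, b, y, a, theta_max):
    """Minimize f(y + theta*a) over [0, theta_max] for f(y) = 0.5<y,Qy> + <b,y>.

    Q is a symmetric positive definite matrix given as a list of rows.
    """
    m = len(y)
    grad = [sum(Q[r][c] * y[c] for c in range(m)) + b[r] for r in range(m)]  # Q y + b
    Qa = [sum(Q[r][c] * a[c] for c in range(m)) for r in range(m)]
    slope = sum(a[r] * grad[r] for r in range(m))        # <a, grad f(y)>
    curvature = sum(a[r] * Qa[r] for r in range(m))      # <a, Q a>, positive if a != 0
    if curvature <= 0.0:
        return 0.0                                       # degenerate direction a = 0
    # unconstrained minimizer, clipped to the feasible interval
    return min(theta_max, max(0.0, -slope / curvature))
```

With $Q = I$ and $b = 0$ this reduces to the line search of the von Neumann Algorithm with Away Steps.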

We next present a general version of Theorem 1 for the above Frank-Wolfe Algorithm with Away Steps. The linear convergence result depends on a certain restricted width and diameter defined as follows. For $x \ge 0$ with $Ax \ne 0$ let

\[ \phi(A, x) := \sup\left\{ \lambda > 0 : \exists\, u, v \in \Delta_{n-1},\ S(u) \subseteq S(x),\ Au - Av = \lambda \frac{Ax}{\|Ax\|} \right\}. \]

Define the restricted width $\phi(A)$ and diameter $d(A)$ of $\mathrm{conv}(A)$ as follows.

\[ \phi(A) := \min_{x} \{\phi(A, x) : x \ge 0,\ Ax \ne 0\}, \tag{14} \]

and

\[ d(A) := \max_{x, u \in \Delta_{n-1}} \|Ax - Au\|. \tag{15} \]

Observe that for $x \ge 0$ with $Ax \ne 0$

\[ \phi(A, x) \le \max_{u, v} \left\{ \frac{\langle Ax, Au - Av\rangle}{\|Ax\|} : u, v \in \Delta_{n-1},\ S(u) \subseteq S(x) \right\}. \]

Thus (9) and (14) imply that $w(A) \ge \phi(A)$ for all nonzero $A \in \mathbb{R}^{m\times n}$. Furthermore, the restricted width $\phi(A)$ can be seen as an extension of the radius $\rho(A)$ defined in (2). Indeed, when $0 \in \mathrm{int}(\mathrm{conv}(A))$, we have $\mathrm{span}(A) = \mathbb{R}^m$. Hence (5) can alternatively be written as

\[ |\rho(A)| = \min_{x\ge 0,\ Ax\ne 0}\ \max\left\{ \lambda : v \in \Delta_{n-1},\ -Av = \lambda \frac{Ax}{\|Ax\|} \right\}. \]

This implies that $\phi(A, x) \ge |\rho(A)| + \frac{\|Ax\|}{\|x\|_1}$ for all $x \ge 0$ with $Ax \ne 0$ (take $u = x/\|x\|_1$). Hence the following inequality readily follows

\[ \phi(A) \ge |\rho(A)|. \]

Section 4 presents a stronger lower bound on $\phi(A)$ in terms of certain variants of $\rho(A)$. In particular, we will show that $\phi(A) > 0$, and consequently $w(A) > 0$, for any nonzero matrix $A \in \mathbb{R}^{m\times n}$ such that $0 \in \mathrm{conv}(A)$.

The linear convergence property of the von Neumann algorithm with away steps, as stated in Theorem 1(a), extends as follows.

Theorem 2 Assume $x^* \in \Delta_{n-1}$ is a minimizer of (13). Let $y^* = Ax^*$ and $\bar A := Q^{1/2}[a_1 - y^* \ \cdots \ a_n - y^*]$. If $x_0 \in \Delta_{n-1}$ is one of the extreme points of $\Delta_{n-1}$ then the iterates $x_k \in \Delta_{n-1}$, $y_k = Ax_k$, $k = 1,2,\dots$ generated by the Frank-Wolfe Algorithm with Away Steps satisfy

\[ f(y_k) - f(y^*) \le \left(1 - \frac{\phi(\bar A)^2}{4 d(\bar A)^2}\right)^{k/2} (f(y_0) - f(y^*)). \tag{16} \]

The proof of Theorem 2 relies on the following two lemmas. The first one is similar to Lemma 1 and also follows via a straightforward calculation.


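Lemma 2 Assume $f$ is as above and $a, y \in \mathbb{R}^m$ satisfy $\langle a, \nabla f(y)\rangle < 0$. Then

\[ \min_{\theta\ge 0} f(y + \theta a) = f(y) - \frac{\langle a, \nabla f(y)\rangle^2}{2\langle a, Qa\rangle} \]

and the minimum is attained at $\theta = -\frac{\langle a, \nabla f(y)\rangle}{\langle a, Qa\rangle}$.

Lemma 3 Assume $f, A, y^*, \bar A$ are as in Theorem 2 above. Then for all $x \in \Delta_{n-1}$

\[ \max_{\ell\in S(x),\ j=1,\dots,n} \langle \nabla f(Ax), a_\ell - a_j\rangle \ge \phi(\bar A)\sqrt{2(f(Ax) - f(y^*))}. \]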

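Proof: Let $y := Ax \in \mathrm{conv}(A)$. Assume $y \ne y^*$ as otherwise there is nothing to show. Since $y^*$ minimizes (13), we have $\langle \nabla f(y^*), y - y^*\rangle \ge 0$. For ease of notation put $\delta := \langle \nabla f(y^*), y - y^*\rangle$ and $\|y - y^*\|_Q^2 := \langle y - y^*, Q(y - y^*)\rangle$. It readily follows that

\[ \langle \nabla f(y), y - y^*\rangle = \|y - y^*\|_Q^2 + \delta > 0, \]

and

\[ 2(f(y) - f(y^*)) = \|y - y^*\|_Q^2 + 2\delta. \]

Hence,

\[
\begin{aligned}
\langle \nabla f(y), y - y^*\rangle^2 &= \left(\|y - y^*\|_Q^2 + \delta\right)^2 \\
&\ge \|y - y^*\|_Q^2\left(\|y - y^*\|_Q^2 + 2\delta\right) \\
&= 2\|y - y^*\|_Q^2\,(f(y) - f(y^*)).
\end{aligned}
\]

Thus

\[ \frac{\langle \nabla f(y), y - y^*\rangle}{\|y - y^*\|_Q} \ge \sqrt{2(f(y) - f(y^*))}. \tag{17} \]

On the other hand, by the definition of $\phi(\bar A)$ there exist $u, v \in \Delta_{n-1}$ with $S(u) \subseteq S(x)$ and $\lambda \ge \phi(\bar A)$ such that $\bar A u - \bar A v = \frac{\lambda}{\|\bar A x\|}\bar A x$. Since $\bar A x = Q^{1/2}(Ax - y^*) = Q^{1/2}(y - y^*)$, the latter equation can be rewritten as

\[ Au - Av = \frac{\lambda}{\|y - y^*\|_Q}(y - y^*). \tag{18} \]

Putting (17) and (18) together we get

\[ \langle \nabla f(y), Au - Av\rangle = \lambda\,\frac{\langle \nabla f(y), y - y^*\rangle}{\|y - y^*\|_Q} \ge \phi(\bar A)\sqrt{2(f(y) - f(y^*))}. \]

To finish, observe that

\[ \max_{\ell\in S(x),\ j=1,\dots,n} \langle \nabla f(Ax), a_\ell - a_j\rangle \ge \langle \nabla f(y), Au - Av\rangle \ge \phi(\bar A)\sqrt{2(f(Ax) - f(y^*))}. \]

∎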


Proof of Theorem 2: This is a modification of the proof of Theorem 1(a). At iteration $k$ the algorithm yields $y_{k+1}$ such that

\[ f(y_{k+1}) = \min_{\theta\in[0,\theta_{\max}]} f(y_k + \theta a) \]

where $a = a_j - y_k$ or $a = y_k - a_\ell$, and

\[ -\langle \nabla f(y_k), a\rangle \ge \tfrac{1}{2}\langle \nabla f(y_k), a_\ell - a_j\rangle \ge \tfrac{1}{2}\phi(\bar A)\sqrt{2(f(y_k) - f(y^*))}. \]

The second inequality above follows from Lemma 3. If $\theta_k < \theta_{\max}$ then Lemma 2 applied to $y := y_k$ yields

\[ f(y_{k+1}) = f(y_k) - \frac{\langle a, \nabla f(y_k)\rangle^2}{2\langle a, Qa\rangle} \le f(y_k) - \frac{\phi(\bar A)^2}{4 d(\bar A)^2}(f(y_k) - f(y^*)), \]

where the inequality also uses $\langle a, Qa\rangle \le d(\bar A)^2$. That is,

\[ f(y_{k+1}) - f(y^*) \le \left(1 - \frac{\phi(\bar A)^2}{4 d(\bar A)^2}\right)(f(y_k) - f(y^*)). \]

Then proceeding as in the last part of the proof of Theorem 1(a) we obtain (16). ∎

Remark 1 A closer look at the proof of Theorem 2 reveals that the convergence bound (16) can be sharpened as follows: Replace $\phi(\bar A)$ with $w_f(A) \ge \phi(\bar A)$, where $w_f(A)$ is the following extension of $w(A)$:

\[ w_f(A) := \min_{x\in\Delta_{n-1},\ Ax\ne y^*}\ \max_{\ell, j} \left\{ \frac{\langle \nabla f(Ax), a_\ell - a_j\rangle}{\sqrt{2(f(Ax) - f(y^*))}} : \ell \in S(x),\ j \in \{1,\dots,n\} \right\}. \]

In the special case when $Q = I$, $b = 0$ problem (13) specializes to problem (1). In this case if $0 \in \mathrm{conv}(A)$ then we have $y^* = 0$ and $w_f(A) = w(A)$. Hence the sharpened version of Theorem 2 yields

\[ \|y_k\|^2 \le \left(1 - \frac{w(A)^2}{4 d(A)^2}\right)^{k/2} \|y_0\|^2. \]

If in addition the columns of $A$ are normalized then $d(A) \le 2$ and we recover the bound in Theorem 1(a).

We have the following related conjecture concerning $w(A)$ and $\phi(A)$.

Conjecture 1 If $A \in \mathbb{R}^{m\times n}$ is non-zero and $0 \in \mathrm{conv}(A)$ then $\phi(A) = w(A)$.


4 Some properties of the restricted width

Throughout this section assume $A \in \mathbb{R}^{m\times n}$ is a nonzero matrix. As we noted in Section 3 above, $w(A) \ge \phi(A)$ and $\phi(A) \ge |\rho(A)|$ when $0 \in \mathrm{int}(\mathrm{conv}(A))$. Our next result establishes a stronger lower bound on $\phi(A)$ in terms of some quantities that generalize $\rho(A)$ to the case when $0 \in \partial\,\mathrm{conv}(A)$. To that end, we recall some terminology and results from [5]. Assume $A = [a_1 \ \cdots \ a_n] \in \mathbb{R}^{m\times n}$ is a non-zero matrix. Then there exists a unique partition $B \cup N = \{1,\dots,n\}$ such that both $A_B x_B = 0$, $x_B > 0$ and $A_N^T y > 0$, $A_B^T y = 0$ are feasible. In particular, $B \ne \emptyset$ if and only if $0 \in \mathrm{conv}(A)$. Also $N \ne \emptyset$ if and only if $0 \notin \mathrm{relint}(\mathrm{conv}(A))$. Furthermore, if $a_i = 0$ then $i \in B$.

The above canonical partition $(B, N)$ allows us to refine the quantity $\rho(A)$ defined by (2) as follows. Let $L := \mathrm{span}(A_B)$ and $L^\perp := \{v \in \mathbb{R}^m : \langle v, y\rangle = 0 \text{ for all } y \in L\}$. By convention, $L = \{0\}$ and $L^\perp = \mathbb{R}^m$ when $B = \emptyset$. If $L \ne \{0\}$, let $\rho_B(A)$ be defined as

\[ \rho_B(A) := \max_{z\in L,\ \|z\|=1}\ \min_{i\in B} \langle a_i, z\rangle. \]

Observe that if $B \ne \emptyset$, then $L = \{0\}$ only when $a_i = 0$ for all $i \in B$.

If $N \ne \emptyset$, let $\rho_N(A)$ be defined as

\[ \rho_N(A) := \max_{z\in L^\perp,\ \|z\|=1}\ \min_{i\in N} \langle a_i, z\rangle. \]

When $L \ne \{0\}$, it can be shown [5] that $\rho_B(A) < 0$. Likewise, when $N \ne \emptyset$ it can be shown that $\rho_N(A) > 0$. In particular, the latter implies that

\[ \rho_N(A) = \max_{z\in L^\perp,\ \|z\|=1}\ \min_{i\in N} \langle a_i, z\rangle = \max_{z\in L^\perp,\ \|z\|\le 1}\ \min_{i\in N} \langle \tilde a_i, z\rangle, \tag{19} \]

where $\tilde a_i$ is the orthogonal projection of $a_i$ onto $L^\perp$. Let $\tilde A_N$ denote the matrix obtained by projecting each of the columns of $A_N$ onto $L^\perp$. From (19) and Lagrangian duality it follows that

\[ \rho_N(A) = \min\{\|y\| : y \in \mathrm{conv}(\tilde A_N)\}. \tag{20} \]

Similarly, it can be shown that if $L \ne \{0\}$ then

\[ |\rho_B(A)| = \max\{r : y \in L,\ \|y\| \le r \Rightarrow y \in \mathrm{conv}(A_B)\}. \tag{21} \]

Observe that (20) and (21) nicely extend (4) and (5). Indeed, (20) is identical to (4) when $B = \emptyset$. Likewise, (21) is identical to (5) when $N = \emptyset$ and $\mathrm{rank}(A) = m$. Furthermore, (20) and (21) imply that $\rho_N(A) = \mathrm{dist}(0, \partial\,\mathrm{conv}(\tilde A_N))$ and $|\rho_B(A)| = \mathrm{dist}_B(0, \partial\,\mathrm{conv}(A_B))$, thereby extending the fact that $|\rho(A)| = \mathrm{dist}(0, \partial\,\mathrm{conv}(A))$.

The next results show that $\phi(A)$ can be bounded below in terms of $\rho_B(A)$ and $\rho_N(A)$. In particular, Corollary 1 shows that $w(A) \ge \phi(A) > 0$ whenever $A \ne 0$ and $0 \in \mathrm{conv}(A)$.

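Theorem 3 Assume $A = [a_1 \ \cdots \ a_n] \in \mathbb{R}^{m\times n}$ is a nonzero matrix.

(a) If $N = \emptyset$ then $L \ne \{0\}$ and $\phi(A) \ge |\rho_B(A)|$.

(b) If $B = \emptyset$ then $\phi(\hat A) \ge \rho_N(A)$ for $\hat A := [A \ \ 0]$.

(c) If $B \ne \emptyset$ and $L = \{0\}$ then $\phi(A) \ge \rho_N(A)$.

(d) If $N \ne \emptyset$ and $L \ne \{0\}$ then

\[ \phi(A) \ge \frac{|\rho_B(A)|\,\rho_N(A)}{\sqrt{\|A\|^2 + \rho_N(A)^2}}, \quad\text{where } \|A\| := \max_{i=1,\dots,n} \|a_i\|. \]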

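Proof:

(a) Assume $x \ge 0$ is such that $y := Ax \ne 0$. In this case $y \in \mathrm{span}(A_B) = L$. Hence $L \ne \{0\}$ and by (21) there exists $v \in \Delta_{n-1}$ and $r \ge |\rho_B(A)|$ such that $-Av = \frac{r}{\|Ax\|}Ax$. Thus for $u := \frac{x}{\|x\|_1}$ we have $u, v \in \Delta_{n-1}$, $S(u) \subseteq S(x)$ and

\[ Au - Av = \left(r + \frac{\|Ax\|}{\|x\|_1}\right)\frac{1}{\|Ax\|}Ax. \]

It follows that $\phi(A, x) \ge r + \frac{\|Ax\|}{\|x\|_1} > |\rho_B(A)|$.

(b) Assume $\hat x := (x, x_{n+1}) \ge 0$ is such that $y := \hat A\hat x = Ax \ne 0$. From (20) it follows that $\frac{\|Ax\|}{\|x\|_1} \ge \rho_N(A)$. Thus for $u := \left(\frac{x}{\|x\|_1}, 0\right)$ and $v := e_{n+1}$ we have $u, v \in \Delta_n$, $S(u) \subseteq S(\hat x)$ and

\[ \hat A u - \hat A v = \frac{\|Ax\|}{\|x\|_1}\,\frac{1}{\|Ax\|}Ax. \]

It follows that $\phi(\hat A, \hat x) \ge \frac{\|Ax\|}{\|x\|_1} \ge \rho_N(A)$.

(c) Since $B \ne \emptyset$ and $L = \{0\}$, it follows that $A_B = 0$ and the columns of $A_N$ are precisely the non-zero columns of $A$. Thus from part (b) we get $\phi([A_N \ \ 0]) \ge \rho_N(A)$. To finish, observe that $\phi(A) = \phi([A_N \ \ 0])$ because $A_B = 0$.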

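(d) Assume $x \ge 0$ is such that $y := Ax \ne 0$. Let $L := \mathrm{span}(A_B)$ and decompose $y = y_L + y_\perp$ where $y_\perp = \tilde A_N x_N \in L^\perp$ and $y_L = A_B x_B + (A_N - \tilde A_N)x_N \in L$. Put $r := \frac{\|y_\perp\|}{\|y\|} \in [0,1]$. Assume $r > 0$ as otherwise $y = y_L \in \mathrm{span}(A_B)$ and the statement holds with the better bound $\phi(A) \ge |\rho_B(A)|$ by proceeding exactly as in part (a). Since $r > 0$, we have $x_N \ne 0$. Put $r_N := \frac{\|y_\perp\|}{\|x_N\|_1}$. From (20) it follows that $r_N \ge \rho_N(A)$. Next, put $w := \frac{1}{\|x_N\|_1}\left((A_N - \tilde A_N)x_N - y_L\right)$. Observe that

\[ \|w\| \le \max_{i\in N}\|a_i - \tilde a_i\| + \frac{\|y_L\|}{\|x_N\|_1} \le \frac{r\|A\| + r_N\sqrt{1 - r^2}}{r} \]

and $w \in L$. Hence by (21) there exists $\hat x_B \ge 0$, $\|\hat x_B\|_1 = 1$ such that $A_B \hat x_B = c\,w$, where

\[ c := \frac{|\rho_B(A)|\,r}{r\|A\| + r_N\sqrt{1 - r^2}} \in (0,1). \]

Taking $\hat x_N := \frac{x_N}{\|x_N\|_1}$ we get

\[ c\,A_N \hat x_N - A_B \hat x_B = \frac{c}{\|x_N\|_1}(y_\perp + y_L) = \frac{|\rho_B(A)|\,r_N}{r\|A\| + r_N\sqrt{1 - r^2}}\,\frac{y}{\|y\|}. \]

Thus letting $u := (1 - c)\frac{x}{\|x\|_1} + c\,(0, \hat x_N)$ and $v := (\hat x_B, 0)$ we get $u, v \in \Delta_{n-1}$, $S(u) \subseteq S(x)$ and

\[ Au - Av = \left((1 - c)\frac{\|Ax\|}{\|x\|_1} + \frac{|\rho_B(A)|\,r_N}{r\|A\| + r_N\sqrt{1 - r^2}}\right)\frac{Ax}{\|Ax\|}. \tag{22} \]

Next, observe that

\[
\begin{aligned}
(1 - c)\frac{\|Ax\|}{\|x\|_1} + \frac{|\rho_B(A)|\,r_N}{r\|A\| + r_N\sqrt{1 - r^2}}
&\ge \frac{|\rho_B(A)|\,r_N}{r\|A\| + r_N\sqrt{1 - r^2}} \\
&\ge \frac{|\rho_B(A)|\,r_N}{\sqrt{\|A\|^2 + r_N^2}} \\
&\ge \frac{|\rho_B(A)|\,\rho_N(A)}{\sqrt{\|A\|^2 + \rho_N(A)^2}}.
\end{aligned}
\tag{23}
\]

The first inequality above follows because $c \in (0,1)$, the second one follows from

\[ \max_{r\in[0,1]}\left(r\|A\| + r_N\sqrt{1 - r^2}\right) = \sqrt{\|A\|^2 + r_N^2}, \]

and the third one follows from $r_N \ge \rho_N(A)$. Putting (22) and (23) together we get $\phi(A, x) \ge \frac{|\rho_B(A)|\,\rho_N(A)}{\sqrt{\|A\|^2 + \rho_N(A)^2}}$. ∎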

Corollary 1 Assume $A = [a_1 \ \cdots \ a_n] \in \mathbb{R}^{m\times n}$ is a nonzero matrix and $0 \in \mathrm{conv}(A)$. Then $w(A) \ge \phi(A) > 0$.

Proof: Apply Theorem 3. Since $0 \in \mathrm{conv}(A)$, we have $B \ne \emptyset$ and thus case (b) cannot occur. If case (a) occurs then $\phi(A) \ge |\rho_B(A)| > 0$ since $\rho_B(A) < 0$ as $L \ne \{0\}$. If case (c) occurs then $\phi(A) \ge \rho_N(A) > 0$. Finally, if case (d) occurs then $\phi(A) \ge \frac{|\rho_B(A)|\,\rho_N(A)}{\sqrt{\|A\|^2 + \rho_N(A)^2}} > 0$, since both $\rho_N(A) > 0$ and $\rho_B(A) < 0$ as $L \ne \{0\}$. To finish, recall that $w(A) \ge \phi(A)$ as established in Section 3. ∎


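We conclude with a few small examples that illustrate the values of $\phi(A)$, $|\rho_B(A)|$, $\rho_N(A)$ and their connection with the bounds in Theorem 3 for the three possible cases: $N = \emptyset$, $B = \emptyset$, and both $B, N \ne \emptyset$.

Example 1 Assume $\epsilon, \delta \in (0,1)$ and let

\[ A = \begin{bmatrix} -1 & 1 & -1 & 1 & -1 & 1 \\ -\epsilon & -\epsilon & \epsilon & \epsilon & \epsilon\delta & \epsilon\delta \end{bmatrix}. \]

In this case $B = \{1,2,3,4,5,6\}$, $N = \emptyset$. It is easy to see that $|\rho_B(A)| = \epsilon$ and $\phi(A) = \phi(A, x) = (1+\delta)\epsilon$ for $x = [0 \ 0 \ 0 \ 0 \ 1/2 \ 1/2]^T$.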
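As a quick numerical sanity check of Example 1, the sketch below evaluates the inner maximum in definition (8) at the point $x = [0 \ 0 \ 0 \ 0 \ 1/2 \ 1/2]^T$. By the inequality between $\phi(A, x)$ and that maximum noted in Section 3, this quantity is an upper bound on $\phi(A, x)$; for this instance it coincides with the value $(1+\delta)\epsilon$ stated in the example. The particular values $\epsilon = \delta = 1/2$ and the helper name `inner_max` are assumptions chosen for illustration.

```python
import math

def inner_max(cols, x):
    """max over l in S(x), j in {1..n} of <Ax, a_l - a_j> / ||Ax|| (see (8))."""
    n, m = len(cols), len(cols[0])
    y = [sum(x[i] * cols[i][r] for i in range(n)) for r in range(m)]   # y = A x
    norm_y = math.sqrt(sum(v * v for v in y))
    g = [sum(cols[i][r] * y[r] for r in range(m)) for i in range(n)]   # <a_i, Ax>
    best_l = max(g[i] for i in range(n) if x[i] > 0.0)                 # l ranges over S(x)
    best_j = min(g)                                                    # j ranges over all columns
    return (best_l - best_j) / norm_y

eps, delta = 0.5, 0.5          # illustrative values in (0, 1)
cols = [[-1.0, -eps], [1.0, -eps], [-1.0, eps], [1.0, eps],
        [-1.0, eps * delta], [1.0, eps * delta]]
x = [0.0, 0.0, 0.0, 0.0, 0.5, 0.5]
value = inner_max(cols, x)     # compare with (1 + delta) * eps
```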

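Example 2 Assume $\delta \in (0,1)$ and let

\[ A = \begin{bmatrix} 1 & -1 \\ \delta & \delta \end{bmatrix}. \]

In this case $B = \emptyset$, $N = \{1,2\}$. It is easy to see that $\rho_N(A) = \delta$ and if we put $\hat A = [A \ \ 0]$ then $\phi(\hat A) = \phi(\hat A, x) = \delta$ for $x = [1/2 \ \ 1/2 \ \ 0]^T$.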
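Example 3 Assume $\epsilon, \delta \in (0,1)$ and let

\[ A = \begin{bmatrix} -1 & 1 & -1 & 1 & 0 & 0 \\ -\epsilon & -\epsilon & \epsilon & \epsilon & 1 & -1 \\ 0 & 0 & 0 & 0 & \delta & \delta \end{bmatrix}. \]

In this case $B = \{1,2,3,4\}$, $N = \{5,6\}$. It is easy to see that $|\rho_B(A)| = \epsilon$, $\rho_N(A) = \delta$. For $x = \left[0 \ \ 0 \ \ \frac{1}{2(1+\epsilon)} \ \ \frac{1}{2(1+\epsilon)} \ \ 0 \ \ \frac{\epsilon}{1+\epsilon}\right]^T$ we get

\[ Ax = \begin{bmatrix} 0 \\ 0 \\ \frac{\epsilon\delta}{1+\epsilon} \end{bmatrix}. \]

It thus follows that $\phi(A) \le \phi(A, x) = \frac{2\epsilon\delta}{1+\epsilon}$. On the other hand, Theorem 3 implies that in this case $\phi(A) \ge \frac{\epsilon\delta}{\sqrt{\max(1+\epsilon^2,\, 1+\delta^2) + \delta^2}}$. In particular, $\frac{\epsilon\delta}{\sqrt{3}} \le \phi(A) \le 2\epsilon\delta$.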